-
1.
A detailed genome-scale metabolic model of Clostridium thermocellum investigates sources of pyrophosphate for driving glycolysis.
Schroeder, WL, Kuil, T, van Maris, AJA, Olson, DG, Lynd, LR, Maranas, CD
Metabolic engineering. 2023;:306-322
Abstract
Lignocellulosic biomass is an abundant and renewable source of carbon for chemical manufacturing, yet it is cumbersome in conventional processes. A promising, and increasingly studied, candidate for lignocellulose bioprocessing is the thermophilic anaerobe Clostridium thermocellum given its potential to produce ethanol, organic acids, and hydrogen gas from lignocellulosic biomass under high substrate loading. Because it possesses an atypical glycolytic pathway that substitutes GTP or pyrophosphate (PPi) for ATP in some steps, including in the energy-investment phase, identifying and manipulating PPi sources is key to engineering its metabolism. Previous efforts to identify the primary pyrophosphate source have been unsuccessful. Here, we explore pyrophosphate metabolism through reconstructing, updating, and analyzing a new genome-scale stoichiometric model for C. thermocellum, iCTH669. Hundreds of changes to the former GEM, iCBI655, including correcting cofactor usages, addressing charge and elemental balance, standardizing biomass composition, and incorporating the latest experimental evidence, led to a MEMOTE score improvement to 94%. We found agreement of iCTH669 model predictions across all available fermentation and biomass yield datasets. The feasibility of hundreds of PPi synthesis routes, newly identified and previously proposed, was assessed through the lens of the iCTH669 model, including biomass synthesis, tRNA synthesis, newly identified sources, and previously proposed PPi-generating cycles. In all cases, the metabolic cost of PPi synthesis is at best equivalent to investment of one ATP, suggesting no direct energetic advantage for the cofactor substitution in C. thermocellum. Even though no unique source of PPi could be gleaned by the model, when combined with gene expression data, two most likely scenarios emerge. First, previously investigated PPi sources likely account for most PPi production in wild-type strains. Second, alternate metabolic routes as encoded by iCTH669 can collectively maintain PPi levels even when previously investigated synthesis cycles are disrupted. Model iCTH669 is available at github.com/maranasgroup/iCTH669.
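The charge and elemental balancing mentioned in this abstract can be illustrated with a minimal sketch. It is not the curation procedure used for iCTH669. The metabolite formulas and charges below are assumptions that follow common BiGG-style protonation states, and the function name `imbalance` is hypothetical. The example checks inorganic pyrophosphatase (PPi + H2O -> 2 Pi + H+) and the same reaction with the proton omitted.

```python
from collections import Counter

# Assumed formulas and charges (BiGG-style protonation states).
METABOLITES = {
    "ppi": ({"H": 1, "O": 7, "P": 2}, -3),  # diphosphate
    "h2o": ({"H": 2, "O": 1}, 0),           # water
    "pi":  ({"H": 1, "O": 4, "P": 1}, -2),  # orthophosphate
    "h":   ({"H": 1}, 1),                   # proton
}


def imbalance(stoich):
    """Return (unbalanced elements, net charge) for one reaction.

    stoich maps metabolite id -> coefficient (negative means consumed).
    A balanced reaction yields ({}, 0).
    """
    elements = Counter()
    charge = 0
    for met, coeff in stoich.items():
        formula, z = METABOLITES[met]
        for element, count in formula.items():
            elements[element] += coeff * count
        charge += coeff * z
    return {el: n for el, n in elements.items() if n != 0}, charge


# Pyrophosphatase written with and without the released proton.
ppase = {"ppi": -1, "h2o": -1, "pi": 2, "h": 1}
ppase_no_proton = {"ppi": -1, "h2o": -1, "pi": 2}

print(imbalance(ppase))            # balanced in elements and charge
print(imbalance(ppase_no_proton))  # missing one H and one positive charge
```

Checks of this kind are what automated model-quality tests typically report per reaction; omitting a proton is a common source of both elemental and charge imbalance.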
-
2.
De novo design and Rosetta-based assessment of high-affinity antibody variable regions (Fv) against the SARS-CoV-2 spike receptor binding domain (RBD).
Boorla, VS, Chowdhury, R, Ramasubramanian, R, Ameglio, B, Frick, R, Gray, JJ, Maranas, CD
Proteins. 2023;(2):196-208
Abstract
The continued emergence of new SARS-CoV-2 variants has accentuated the growing need for fast and reliable methods for the design of potentially neutralizing antibodies (Abs) to counter immune evasion by the virus. Here, we report on the de novo computational design of high-affinity Ab variable regions (Fv) through the recombination of VDJ genes targeting the most solvent-exposed hACE2-binding residues of the SARS-CoV-2 spike receptor binding domain (RBD) protein using the software tool OptMAVEn-2.0. Subsequently, we carried out computational affinity maturation of the designed variable regions through amino acid substitutions for improved binding with the target epitope. Immunogenicity of designs was restricted by preferring designs that match sequences from a 9-mer library of "human Abs" based on a human string content score. We generated 106 different antibody designs and reported in detail on the top five that trade off the greatest computational binding affinity for the RBD with human string content scores. We further describe computational evaluation of the top five designs produced by OptMAVEn-2.0 using a Rosetta-based approach. We used Rosetta SnugDock for local docking of the designs to evaluate their potential to bind the spike RBD and performed "forward folding" with DeepAb to assess their potential to fold into the designed structures. Ultimately, our results identified one designed Ab variable region, P1.D1, as a particularly promising candidate for experimental testing. This effort puts forth a computational workflow for the de novo design and evaluation of Abs that can quickly be adapted to target spike epitopes of emerging SARS-CoV-2 variants or other antigenic targets.
-
3.
Dissecting the metabolic reprogramming of maize root under nitrogen-deficient stress conditions.
Chowdhury, NB, Schroeder, WL, Sarkar, D, Amiour, N, Quilleré, I, Hirel, B, Maranas, CD, Saha, R
Journal of experimental botany. 2022;(1):275-291
Abstract
The growth and development of maize (Zea mays L.) largely depends on its nutrient uptake through the root. Hence, studying its growth, response, and associated metabolic reprogramming to stress conditions is becoming an important research direction. A genome-scale metabolic model (GSM) for the maize root was developed to study its metabolic reprogramming under nitrogen stress conditions. The model was reconstructed based on the available information from KEGG, UniProt, and MaizeCyc. Transcriptomics data derived from the roots of hydroponically grown maize plants were used to incorporate regulatory constraints in the model and simulate nitrogen-non-limiting (N+) and nitrogen-deficient (N-) conditions. Model-predicted flux-sum variability analysis achieved 70% accuracy compared with the experimental change of metabolite levels. In addition to predicting important metabolic reprogramming in central carbon, fatty acid, amino acid, and other secondary metabolism, the maize root GSM predicted several metabolites (l-methionine, l-asparagine, l-lysine, cholesterol, and l-pipecolate) playing a regulatory role in the root biomass growth. Furthermore, this study revealed eight phosphatidylcholine and phosphatidylglycerol metabolites which, even though not coupled with biomass production, played a key role in the increased biomass production under N-deficient conditions. Overall, the omics-integrated GSM provides a promising tool to facilitate stress condition analysis for maize root and engineer better stress-tolerant maize genotypes.
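For readers unfamiliar with the flux-sum quantity behind the analysis above, the sketch below computes it for a single metabolite using the commonly cited definition: half the sum of absolute fluxes through the metabolite, i.e. its turnover at steady state. The toy stoichiometry and flux values are hypothetical, and the sketch does not reproduce the variability analysis, which finds the range of this quantity under model constraints.

```python
def flux_sum(stoich_row, fluxes):
    """Flux-sum of one metabolite: 0.5 * sum_j |S_ij * v_j|.

    stoich_row holds the metabolite's stoichiometric coefficients S_ij and
    fluxes holds the matching reaction fluxes v_j.
    """
    return 0.5 * sum(abs(s * v) for s, v in zip(stoich_row, fluxes))


# Hypothetical metabolite made by R1 and consumed by R2 and R3.
S_M = [1, -1, -1]

# Hypothetical steady-state flux vectors for two conditions.
v_n_plus = [10.0, 6.0, 4.0]   # nitrogen-non-limiting (N+)
v_n_minus = [4.0, 3.0, 1.0]   # nitrogen-deficient (N-)

turnover_plus = flux_sum(S_M, v_n_plus)
turnover_minus = flux_sum(S_M, v_n_minus)
print(turnover_plus, turnover_minus)
```

A drop in flux-sum between conditions is the kind of model signal that can be compared with the direction of change in a measured metabolite level.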
-
4.
IPRO+/-: Computational Protein Design Tool Allowing for Insertions and Deletions.
Chowdhury, R, Grisewood, MJ, Boorla, VS, Yan, Q, Pfleger, BF, Maranas, CD
Structure (London, England : 1993). 2020;(12):1344-1357.e4
Abstract
Insertions and deletions (indels) in protein sequences alter the residue spacing along the polypeptide backbone and consequently open up possibilities for tuning protein function in a way that is inaccessible by amino acid substitution alone. We describe an optimization-based computational protein redesign approach centered around predicting beneficial combinations of indels along with substitutions and obtaining putative substrate-docked structures for these protein variants. This modified algorithmic capability would be of interest for enzyme engineering and broadly inform other protein design tasks. We highlight this capability by (1) identifying active variants of a bacterial thioesterase enzyme ('TesA) with experimental corroboration, (2) recapitulating existing active TEM-1 β-Lactamase sequences of different sizes, and (3) identifying shorter 4-Coumarate:CoA ligases with enhanced in vitro activities toward non-native substrates. A separate PyRosetta-based open-source tool, Indel-Maker (http://www.maranasgroup.com/software.htm), has also been created to construct computational models of user-defined protein variants with specific indels and substitutions.
-
5.
SNPeffect: identifying functional roles of SNPs using metabolic networks.
Sarkar, D, Maranas, CD
The Plant journal : for cell and molecular biology. 2020;(2):512-531
Abstract
Genetic sources of phenotypic variation have been a focus of plant studies aimed at improving agricultural yield and understanding adaptive processes. Genome-wide association studies identify the genetic background behind a trait by examining associations between phenotypes and single-nucleotide polymorphisms (SNPs). Although such studies are common, biological interpretation of the results remains a challenge, especially due to the confounding nature of population structure and the systematic biases thus introduced. Here, we propose a complementary analysis (SNPeffect) that offers putative genotype-to-phenotype mechanistic interpretations by integrating biochemical knowledge encoded in metabolic models. SNPeffect is used to explain differential growth rate and metabolite accumulation in A. thaliana and P. trichocarpa accessions as the outcome of SNPs in enzyme-coding genes. To this end, we also constructed a genome-scale metabolic model for Populus trichocarpa, the first for a perennial woody tree. As expected, our results indicate that growth is a complex polygenic trait governed by carbon and energy partitioning. The predicted set of functional SNPs in both species is associated with experimentally characterized growth-determining genes and also suggests putative ones. Functional SNPs were found in pathways such as amino acid metabolism, nucleotide biosynthesis, and cellulose and lignin biosynthesis, in line with breeding strategies that target pathways governing carbon and energy partition.
-
6.
A comprehensive genome-scale model for Rhodosporidium toruloides IFO0880 accounting for functional genomics and phenotypic data.
Dinh, HV, Suthers, PF, Chan, SHJ, Shen, Y, Xiao, T, Deewan, A, Jagtap, SS, Zhao, H, Rao, CV, Rabinowitz, JD, et al
Metabolic engineering communications. 2019;:e00101
Abstract
Rhodosporidium toruloides is a red, basidiomycetes yeast that can accumulate a large amount of lipids and produce carotenoids. To better assess this non-model yeast's metabolic capabilities, we reconstructed a genome-scale model of R. toruloides IFO0880's metabolic network (iRhto1108) accounting for 2204 reactions, 1985 metabolites and 1108 genes. In this work, we integrated and supplemented the current knowledge with in-house generated biomass composition and experimental measurements pertaining to the organism's metabolic capabilities. Predictions of genotype-phenotype relations were improved through manual curation of gene-protein-reaction rules for 543 reactions, leading to correct recapitulation of 84.5% of gene essentiality data (sensitivity of 94.3% and specificity of 53.8%). Organism-specific macromolecular composition and ATP maintenance requirements were experimentally measured for two separate growth conditions: (i) carbon and (ii) nitrogen limitations. Overall, iRhto1108 reproduced R. toruloides's utilization capabilities for 18 alternate substrates, matched measured wild-type growth yield, and recapitulated the viability of 772 out of 819 deletion mutants. As a demonstration of the model's fidelity in guiding engineering interventions, the OptForce procedure was applied on iRhto1108 for triacylglycerol overproduction. Suggested interventions recapitulated many of the previous successful implementations of genetic modifications and put forth a few new ones.
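The accuracy, sensitivity, and specificity figures quoted above follow from a standard confusion matrix, sketched here. The sketch assumes that a "positive" is a viable deletion mutant, which is consistent with 772/819 ≈ 94.3%. The abstract does not give the counts for lethal deletions, so the values `tn=140` and `fp=120` are hypothetical, chosen only because they reproduce the reported percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # fraction of positives recovered
    specificity = tn / (tn + fp)                # fraction of negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all calls correct
    return sensitivity, specificity, accuracy


# tp and fn come from the abstract (772 of 819 viable mutants predicted viable).
# tn and fp are HYPOTHETICAL counts for lethal deletions, for illustration only.
sens, spec, acc = classification_metrics(tp=772, fn=819 - 772, tn=140, fp=120)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

The gap between sensitivity and specificity shows why overall accuracy alone can flatter a model when viable mutants far outnumber lethal ones.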
-
7.
Development of a core Clostridium thermocellum kinetic metabolic model consistent with multiple genetic perturbations.
Dash, S, Khodayari, A, Zhou, J, Holwerda, EK, Olson, DG, Lynd, LR, Maranas, CD
Biotechnology for biofuels. 2017;:108
Abstract
BACKGROUND: Clostridium thermocellum is a Gram-positive anaerobe with the ability to hydrolyze and metabolize cellulose into biofuels such as ethanol, making it an attractive candidate for consolidated bioprocessing (CBP). At present, metabolic engineering in C. thermocellum is hindered due to the incomplete description of its metabolic repertoire and regulation within a predictive metabolic model. Genome-scale metabolic (GSM) models augmented with kinetic models of metabolism have been shown to be effective at recapitulating perturbed metabolic phenotypes. RESULTS: In this effort, we first update a second-generation genome-scale metabolic model (iCth446) for C. thermocellum by correcting cofactor dependencies, restoring elemental and charge balances, and updating GAM and NGAM values to improve phenotype predictions. The iCth446 model is next used as a scaffold to develop a core kinetic model (k-ctherm118) of the C. thermocellum central metabolism using the Ensemble Modeling (EM) paradigm. Model parameterization is carried out by simultaneously imposing fermentation yield data in lactate, malate, acetate, and hydrogen production pathways for 19 measured metabolites spanning a library of 19 distinct single and multiple gene knockout mutants along with 18 intracellular metabolite concentration data for a Δgldh mutant and ten experimentally measured Michaelis-Menten kinetic parameters. CONCLUSIONS: The k-ctherm118 model captures significant metabolic changes caused by (1) nitrogen limitation leading to increased yields for lactate, pyruvate, and amino acids, and (2) ethanol stress causing an increase in intracellular sugar phosphate concentrations (~1.5-fold) due to upregulation of cofactor pools. Robustness analysis of k-ctherm118 alludes to the presence of a secondary activity of ketol-acid reductoisomerase and possible regulation by valine and/or leucine pool levels. In addition, cross-validation and robustness analysis allude to missing elements in k-ctherm118 and suggest additional experiments to improve kinetic model prediction fidelity. Overall, the study quantitatively assesses the advantages of EM-based kinetic modeling towards improved prediction of C. thermocellum metabolism and develops a predictive kinetic model which can be used to design biofuel-overproducing strains.
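The Michaelis-Menten parameters mentioned in this abstract enter a rate law of the textbook form v = Vmax·[S] / (Km + [S]). The sketch below evaluates that irreversible, single-substrate form only. It is not the elementary-step kinetics used in Ensemble Modeling, and the parameter values are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Irreversible single-substrate Michaelis-Menten rate.

    s is the substrate concentration, vmax the maximal rate, and km the
    substrate concentration at which the rate is half of vmax.
    """
    return vmax * s / (km + s)


# Hypothetical parameters (arbitrary but consistent units).
VMAX = 10.0
KM = 2.0

half_max = michaelis_menten(KM, VMAX, KM)             # exactly Vmax / 2 at s = Km
near_saturation = michaelis_menten(200.0, VMAX, KM)   # approaches Vmax
print(half_max, near_saturation)
```

A measured Km pins down where this curve bends, which is why a handful of measured kinetic constants can usefully constrain a much larger set of sampled model parameters.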
-
8.
SteadyCom: Predicting microbial abundances while ensuring community stability.
Chan, SHJ, Simons, MN, Maranas, CD
PLoS computational biology. 2017;(5):e1005539
Abstract
Genome-scale metabolic modeling has become widespread for analyzing microbial metabolism. Extending this established paradigm to more complex microbial communities is emerging as a promising way to unravel the interactions and biochemical repertoire of these omnipresent systems. While several modeling techniques have been developed for microbial communities, little emphasis has been placed on the need to impose a time-averaged constant growth rate across all members of a community to ensure co-existence and stability. In the absence of this constraint, the faster growing organism will ultimately displace all other microbes in the community. This is particularly important for predicting steady-state microbiota composition as it imposes significant restrictions on the allowable community membership, composition and phenotypes. In this study, we introduce the SteadyCom optimization framework for predicting metabolic flux distributions consistent with the steady-state requirement. SteadyCom converges rapidly by iteratively solving linear programming (LP) problems, and the number of iterations is independent of the number of organisms. A significant advantage of SteadyCom is compatibility with flux variability analysis. SteadyCom is first demonstrated for a community of four E. coli double auxotrophic mutants and is then applied to a gut microbiota model consisting of nine species, with representatives from the phyla Bacteroidetes, Firmicutes, Actinobacteria and Proteobacteria. In contrast to the direct use of FBA, SteadyCom is able to predict the change in species abundance in response to changes in diets with minimal additional imposed constraints on the model. By randomizing the uptake rates of microbes, an abundance profile with a good agreement to experimental gut microbiota is inferred. SteadyCom provides an important step towards the cross-cutting task of predicting the composition of a microbial community in a given environment.
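The argument that unequal growth rates rule out stable co-existence can be checked with a few lines of arithmetic. The sketch below is not the SteadyCom formulation. It assumes independent exponential growth at fixed rates, and the rates, starting abundances, and time horizon are hypothetical.

```python
import math


def abundance_fractions(x0, mu, t):
    """Relative abundances after time t under independent exponential growth.

    x0 holds initial biomass per organism and mu the specific growth rates.
    """
    biomass = [x * math.exp(m * t) for x, m in zip(x0, mu)]
    total = sum(biomass)
    return [b / total for b in biomass]


x0 = [0.5, 0.5]  # hypothetical equal starting abundances

# Unequal rates: the faster grower takes over the community.
unequal = abundance_fractions(x0, [0.5, 0.4], t=100.0)

# Equal rates (the steady-state requirement): composition is preserved.
equal = abundance_fractions(x0, [0.4, 0.4], t=100.0)

print(unequal)
print(equal)
```

With a growth-rate difference of only 0.1 per unit time, the slower organism falls to a negligible fraction by t = 100, which is why a shared community growth rate is a precondition for any stable composition.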
-
9.
Diurnal Regulation of Cellular Processes in the Cyanobacterium Synechocystis sp. Strain PCC 6803: Insights from Transcriptomic, Fluxomic, and Physiological Analyses.
Saha, R, Liu, D, Hoynes-O'Connor, A, Liberton, M, Yu, J, Bhattacharyya-Pakrasi, M, Balassy, A, Zhang, F, Moon, TS, Maranas, CD, et al
mBio. 2016;(3)
Abstract
Synechocystis sp. strain PCC 6803 is the most widely studied model cyanobacterium, with a well-developed omics-level knowledge base. Like the lifestyles of other cyanobacteria, that of Synechocystis PCC 6803 is tuned to diurnal changes in light intensity. In this study, we analyzed the expression patterns of all of the genes of this cyanobacterium over two consecutive diurnal periods. Using stringent criteria, we determined that the transcript levels of nearly 40% of the genes in Synechocystis PCC 6803 show robust diurnal oscillating behavior, with a majority of the transcripts being upregulated during the early light period. Such transcripts corresponded to a wide array of cellular processes, such as light harvesting, photosynthetic light and dark reactions, and central carbon metabolism. In contrast, transcripts of membrane transporters for transition metals involved in the photosynthetic electron transport chain (e.g., iron, manganese, and copper) were significantly upregulated during the late dark period. Thus, the pattern of global gene expression led to the development of two distinct transcriptional networks of coregulated oscillatory genes. These networks help describe how Synechocystis PCC 6803 regulates its metabolism toward the end of the dark period in anticipation of efficient photosynthesis during the early light period. Furthermore, in silico flux prediction of important cellular processes and experimental measurements of cellular ATP, NADP(H), and glycogen levels showed how this diurnal behavior influences its metabolic characteristics. In particular, NADPH/NADP(+) showed a strong correlation with the majority of the genes whose expression peaks in the light. We conclude that this ratio is a key endogenous determinant of the diurnal behavior of this cyanobacterium. IMPORTANCE: Cyanobacteria are photosynthetic microbes that use energy from sunlight and CO2 as feedstock. Certain cyanobacterial strains are amenable to facile genetic manipulation, thus enabling synthetic biology and metabolic engineering applications. Such strains are being developed as a chassis for the sustainable production of food, feed, and fuel. To this end, a holistic knowledge of cyanobacterial physiology and its correlation with gene expression patterns under the diurnal cycle is warranted. In this report, a genomewide transcriptional analysis of Synechocystis PCC 6803, the most widely studied model cyanobacterium, sheds light on the global coordination of cellular processes during diurnal periods. Furthermore, we found that, in addition to light, the redox level of NADP(H) is an important endogenous regulator of diurnal entrainment of Synechocystis PCC 6803.
-
10.
13C metabolic flux analysis at a genome-scale.
Gopalakrishnan, S, Maranas, CD
Metabolic engineering. 2015;:12-22
Abstract
Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling up mapping models to a genome-scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM) (697 reactions and 595 metabolites) is constructed using as a basis the iAF1260 model upon eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon were used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both topology and estimated values of the metabolic fluxes remain largely consistent between the core and GSM models. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expands by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux is essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM model, the unused ATP decreased drastically, with the lower bound matching the maintenance ATP requirement. A non-zero flux for the arginine degradation pathway was identified to meet biomass precursor demands as detailed in the iAF1260 model. Inferred ranges for 81% of the reactions in the genome-scale metabolic (GSM) model varied less than one-tenth of the basis glucose uptake rate (95% confidence test). This is because as many as 411 reactions in the GSM are growth coupled, meaning that the single measurement of biomass formation rate locks the reaction flux values. This implies that accurate biomass formation rate and composition are critical for resolving metabolic fluxes away from central metabolism and suggests the importance of biomass composition (re)assessment under different genetic and environmental backgrounds. In addition, the loss of information associated with mapping fluxes from MFA on a core model to a GSM model is quantified.
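The least-squares principle used for flux estimation in this abstract can be sketched on a toy problem. This is not the EMU decomposition algorithm: the sketch assumes that each fragment's labeled fraction is a linear mixture of two hypothetical routes, and it finds the split ratio by grid search. All labeling values are invented for illustration.

```python
def ssr(predicted, measured):
    """Sum of squared differences between predicted and measured labeling."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured))


def predict(f, route_a, route_b):
    """Labeled fraction per fragment when a fraction f of flux uses route A."""
    return [f * a + (1 - f) * b for a, b in zip(route_a, route_b)]


# Hypothetical labeled fractions each route alone would produce (3 fragments).
route_a = [0.90, 0.10, 0.50]
route_b = [0.10, 0.70, 0.50]

# Hypothetical measurements, constructed to correspond to f = 0.6.
measured = [0.58, 0.34, 0.50]

# Grid search over the split ratio f in [0, 1] with step 0.001.
candidates = [i / 1000 for i in range(1001)]
best_f = min(candidates, key=lambda f: ssr(predict(f, route_a, route_b), measured))
print(best_f)
```

The third fragment is labeled identically by both routes and so carries no information about the split. Adding routes that the data cannot distinguish is the same effect that widens flux ranges when moving from a core to a genome-scale mapping model.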